Abstract

Functional neuroimaging is a useful approach to study the neural correlates of visual 
perceptual expertise. The purpose of this paper is to review the functional-neuroimaging 
methods that have been implemented in previous research in this context. First, we will 
discuss research questions typically addressed in visual expertise research. Second, we 
will describe which kinds of stimuli are employed and which functional-neuroimaging 
techniques are implemented in this kind of research, with a special focus on 
electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). 
Third, we will summarize the outcomes of recent studies that addressed the neural 
correlates of visual expertise and will particularly focus on studies that examined the 
neural correlates of visual expertise in medical image diagnosis. Finally, the review 
closes with a discussion of the benefits, caveats, and future directions of
cognitive-neuroscience research for studying visual expertise.

1. Introduction

Expertise can be defined as maximal adaptations to task constraints (Ericsson & Lehmann, 1996; 
Gruber, Jansen, Marienhagen, & Altenmueller, 2010) which can take many forms, including, among others, 
motor expertise, memory expertise, or perceptual expertise (Ericsson & Lehmann, 1996). Perceptual 
expertise can be further categorized as visual, auditory, tactile, olfactory, vestibular, or gustatory expertise. 
Visual expertise is evident, for example, when bird experts classify a passing little bird as an oriole or a 
cardinal (Tanaka & Curran, 2001) or when clinicians diagnose digitized slides of human tissue as 
pathologically normal or abnormal (Helle et al., 2011). Assuming that individual differences in visual 
perceptual expertise should be reflected in differences in the brain, the following question arises: Can we
reliably measure and objectify neural correlates of visual expertise with currently available
functional-neuroimaging methods and thereby explain inter-individual behavioral differences with respect to visual
perceptual expertise?

In line with the overall goal of this special issue to introduce and discuss methodological approaches 
in visual expertise research (Gegenfurtner & Van Merrienboer, 2017), the purpose of the present 
methodological review is to reflect on the promises and pitfalls of cognitive-neuroscience methods in the 
study of visual perceptual expertise. While the review can offer input for discussions among scholars 
experienced in conducting neuroscientific studies, the manuscript is mainly written to inform scholars who 
are unfamiliar with the methodological repertoire of functional neuroimaging and its use in expertise 
research. In this review, we will particularly address expertise in medical image diagnosis, which can be 
defined as the inspection and interpretation of a visual representation of the human anatomy or its functions 
(Gegenfurtner, Kok, Van Geel, De Bruin, Jarodzka, Szulewski, & Van Merrienboer, 2017); but because this 
body of research is still limited and in its infancy, we will extend our review to other content domains with 
the aim of offering a more useful overview of current methodological decisions in the visual perceptual 
expertise literature. There are already several systematic reviews available on the neural aspects of visual 
perceptual expertise (for example, Richler & Gauthier, 2014, for face perception or Gegenfurtner, Siewiorek, 
Lehtinen, & Saljo, 2013, for medical image diagnosis). The present review has a particular emphasis on 
the application of cognitive-neuroscience (especially functional-neuroimaging) methods to visual perceptual
expertise and follows four steps. First, we will start with a short discussion of typically addressed
research questions. Second, we will describe which kinds of stimuli are employed and which
functional-neuroimaging methods are used, with a special focus on the frequently implemented methods EEG and
fMRI. Third, we will summarize the outcomes of studies that addressed the neural correlates of visual 
perceptual expertise. And finally, we close this review with a discussion of the benefits, caveats, and future 
directions of cognitive-neuroscience research for studying visual expertise. 


2. Research questions 

Research on visual perceptual expertise has focused on a wide range of different research questions. 
These research questions can be clustered in three distinct types: contrastive, developmental, and 
conditional. Naturally, research questions strongly correspond with the research design. For example, 
contrastive research questions ask how participants of different levels of expertise vary in different neural 
measures. In a classic study, Haller and Radue (2005) were interested in examining “neuronal activations 
during processing of radiologic and non-radiologic images by experienced radiologists and non-radiologist 
subjects by using event-related functional magnetic resonance (MR) imaging” (p. 983). This is a 
representative example for the first type of research questions (contrastive research questions). A second 
type of research questions, developmental research questions, asks how participants neurally adapt to visual 
perceptual training. These studies typically employ a paradigm implementing a training of inexperienced 
participants over the course of several weeks. For example, Gauthier and colleagues (1998) were interested 
in examining whether increased experience with so-called ‘Greebles’, artificially created stimuli (see Figure 2),
would lead to an increase of fMRI activation in a particular brain region, the so-called ‘fusiform face area’
(the outcomes of this study are presented and discussed below). Finally, the third type of research questions,
conditional research questions, addresses the extent to which expertise effects - be they contrastive or 
developmental - are contingent on task conditions such as the duration of stimulus presentation or different 
manipulations of the presented stimuli. For example, Bilalic and colleagues (2016) were interested in 
unravelling whether expertise effects are moderated by the orientation of the presented stimulus, in their case
X-ray films shown either in a normal, upright position or in an inverted position (rotated by 180°).
Comparability across studies depends on the research questions and designs used. When designing a
cognitive-neuroscience study, one can follow a single research question or several research questions, even from
different research-question types (see above). Typically, conditional research questions that address the
moderating effect of stimulus or task conditions are combined with contrastive or developmental
research designs.


3. Methodology in visual perceptual expertise research 

In this section, we review established methods implemented in cognitive-neuroscience studies in the 
field of visual perceptual expertise. We first describe frequently used artificial and naturalistic stimuli. We 
then look at the methodology of fMRI and EEG, in particular at what kind of information can be derived
from fMRI and EEG signals, and we also outline other, less frequently used techniques in cognitive 
neuroscience. 


Figure 1. Examples of Smoothies, Spikies, and Cubies.

3.1. Stimuli 

3.1.1. Artificial stimuli 

Artificially created stimuli are objects that have no common reference in the real world. This is a 
deliberate choice to avoid any confounding effects that may be induced from familiarity with the object. 
Several groups of artificial stimuli have been introduced. Some of those used more frequently are
‘Smoothies’, ‘Spikies’, and ‘Cubies’, and, perhaps most prominently, Greebles (mentioned above).
Smoothies, Spikies, and Cubies are Matlab-generated classes of objects that “were designed to have different
shape properties and to seem novel (i.e., they did not immediately suggest associations with everyday object
categories)” (Op de Beeck, Baker, DiCarlo, & Kanwisher, 2006, p. 13025). Figure 1 shows example
Smoothies, Spikies, and Cubies. These artificial stimuli were created with variations of different dimensions, 
so that participants need to process more than one location of the object to attain high rates of discrimination. 


Greebles are objects specifically constrained to be similar to faces along several dimensions. 
Figure 2 shows example Greebles. Greebles are photo-realistically rendered, three-dimensional,
computer-generated objects that all share a common configuration. As Gauthier, Williams, Tarr, and Tanaka (1998)
explain: “Each Greeble is made up of a vertically-oriented ‘body’ with four protruding ‘appendages’, from
top to bottom, two ‘boges’, a ‘quiff’ and a ‘dunth’” (p. 2402). Greebles come in two different genders (called
“glip” and “plok”) and five families (called “galli”, “osmit”, “radok”, “samar”, and “tasio”). Greebles have 
been used in a range of studies using both fMRI and EEG. 

Figure 2. Examples of Greeble objects in their two genders and five families (open source from:
https://commons.wikimedia.org/wiki/Category:Greeble).


3.1.2. Real-world stimuli 

In contrast to these artificial stimuli that share little to no resemblance with naturally occurring
objects, researchers also use real-world stimuli. These stimuli are called “real-world” to indicate that these
objects are not researcher-generated. Associated with real-world stimuli is the assumption that there are
real-world experts that have developed visual skills related to these objects (Shen, Mack, & Palmeri, 2014), so
these materials are used in an attempt to create ecologically valid domain-specific tasks. Real-world stimuli
can be classified as faces and non-face objects. First, photographs of faces (or of parts of faces) are
extensively used as stimuli in visual perceptual expertise research because we have so much exposure to
faces that this makes us all experts in face recognition (Bentin, Allison, Puce, Perez, & McCarthy, 1996;
Richler & Gauthier, 2014). Second, photographs of non-face objects include cars (Gauthier, Skudlarski,
Gore, & Anderson, 2000), different animal species such as birds (Tanaka, Curran, & Sheinberg, 2005) or 
dogs (Tanaka & Curran, 2005), and also letters such as Japanese (Maurer, Zevin, & McCandliss, 2008) and 
Chinese characters (Fan, Chen, Zhang, Qi, Jin, Wang, et al., 2015; Qi, Wang, Hao, Zhu, He, & Luo, 2016). 
Researchers also use representations of chess positions (Bilalic, Langner, Ulrich, & Grodd, 2011) and 
medical images (Haller & Radue, 2005). In many studies, these real-world stimuli are presented either in
their original form or in an inverted, rotated, or otherwise artificially distorted form. The rationale behind these artificial
manipulations is to complicate and change pattern recognition for expert participants. Typically, real-world 
stimuli in these studies are static and two-dimensional. If we assume that the comprehension of 
visualizations is moderated by variations in dimensionality and dynamics (for a meta-analysis testing this 
assumption, see Gegenfurtner, Lehtinen, & Saljo, 2011), then it seems surprising that the literature on the 
neural correlates of real-world visual perceptual expertise has not yet systematically compared how the brain 
processes of experts and novices differ when they view static vs. dynamic stimuli or two-dimensional vs. 
three-dimensional visualizations. 


3.2. Apparatus 

While viewing different kinds of stimuli, participants’ neural correlates can be measured with 
cognitive-neuroscience techniques. Measuring these neural correlates is contingent on the study interests and 
research questions. Typically, if researchers are interested in the temporal aspects of image processing, they 
use electroencephalography (EEG). Conversely, if researchers are interested in the spatial aspects of image 
processing, they use functional magnetic resonance imaging (fMRI). In addition to EEG and fMRI, there are 
also several other measurement techniques, including magnetoencephalography (MEG), positron emission 
tomography (PET), and functional near-infrared spectroscopy (fNIRS). Offering detailed descriptions of 
each of these techniques is beyond the scope of this review. Ward (2006) and Squire and colleagues (2013) 
offer easy-to-understand introductions. But it is informative here to briefly describe the two most frequently 
used techniques to illustrate how they work and what they measure. These are EEG and fMRI. 

3.2.1 EEG 

Neurons communicate through electrical signals transmitted along axons and dendrites. When 
populations of neurons that are oriented in parallel are synchronously active, their electrical signals can be 
measured with electrodes placed on the scalp. Electroencephalography (EEG) records and amplifies these 
electrical signals over time. When we perceive a picture, particular populations of neurons in our brain 
respond to this picture. This response is measurable as a change in voltage at the scalp before, during, and
after seeing the picture. If we average the recorded EEG signal across many trials, random brain activity that 
is unrelated to the neural processing of the picture is cancelled out. The relevant (stimulus-related) signal is 
preserved and called the ‘event-related potential’ (ERP). When recording EEG from participants while they 
looked at pictures of faces, Bentin and colleagues (1996) found a negative event-related potential (N) that 
reached its maximum at approximately 172 ms (N170) after picture onset. Since this pioneering study, the 
N170 has become a widely studied EEG component in cognitive neuroscience. EEG measures have a high 
temporal resolution and are therefore time-sensitive. Thus, EEG is especially suited to investigating
temporal patterns of brain activity. However, EEG has a relatively low spatial resolution, meaning that the
localization of the EEG signal source (i.e., the location of the specific neuronal populations evoking the
electrical brain activity) cannot be ascertained with high precision.
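
The averaging step described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis pipeline: the sampling rate, the number of trials, the noise level, the Gaussian shape of the deflection, and the 130-210 ms search window are all illustrative choices, and the function names are hypothetical.

```python
import math
import random
import statistics


def simulate_trial(n_samples, sfreq, rng, peak_ms=170.0, amp_uv=-5.0, noise_uv=10.0):
    """One simulated epoch: a negative Gaussian-shaped deflection plus random noise."""
    trial = []
    for i in range(n_samples):
        t_ms = 1000.0 * i / sfreq
        signal = amp_uv * math.exp(-((t_ms - peak_ms) ** 2) / (2 * 20.0 ** 2))
        trial.append(signal + rng.gauss(0.0, noise_uv))
    return trial


def average_trials(trials):
    """Point-by-point mean across trials, i.e. the event-related potential (ERP)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def most_negative_peak(erp, sfreq, window_ms=(130.0, 210.0)):
    """Latency (ms) and amplitude of the most negative sample inside a search window."""
    lo = int(window_ms[0] * sfreq / 1000.0)
    hi = int(window_ms[1] * sfreq / 1000.0)
    idx = min(range(lo, hi + 1), key=lambda i: erp[i])
    return 1000.0 * idx / sfreq, erp[idx]


rng = random.Random(0)      # fixed seed so the simulation is repeatable
sfreq = 500.0               # samples per second (assumed)
n_samples = 200             # 400 ms following picture onset
trials = [simulate_trial(n_samples, sfreq, rng) for _ in range(400)]
erp = average_trials(trials)

latency_ms, amplitude_uv = most_negative_peak(erp, sfreq)

# Variability in the first 50 ms, where the simulated signal is essentially zero:
single_trial_noise = statistics.pstdev(trials[0][:25])
averaged_noise = statistics.pstdev(erp[:25])

print(f"most negative peak at {latency_ms:.0f} ms ({amplitude_uv:.1f} microvolts)")
print(f"baseline noise: single trial {single_trial_noise:.1f}, average {averaged_noise:.1f}")
```

In a single simulated trial the deflection is buried in noise. After averaging, the noise that is unrelated to the stimulus shrinks roughly with the square root of the number of trials, whereas the time-locked deflection is preserved.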

3.2.2 fMRI 

FMRI indirectly measures neural activity through its vascular response: Following (e.g., visual) 
stimulation, neuronal activity in particular (e.g., visual) brain regions increases, which results in enhanced
local oxygen consumption. Neuronal tissue gets new oxygen from the oxygenated hemoglobin in the blood.
Within a few seconds, the blood flow and the concentration of oxygenated hemoglobin in the blood increase
in the particular brain region. This increase is called the hemodynamic response. Since oxygenated and
deoxygenated hemoglobin have different magnetic properties, the hemodynamic (or the blood oxygenation
level-dependent, BOLD) response can be imaged using fMRI. Note that the hemodynamic response is
considerably delayed and prolonged, which places some constraints on the design of fMRI experiments.
Compared to EEG, the temporal resolution of fMRI is rather low (one data point is normally obtained every
1-2 s). However, the spatial resolution is considerably higher (in the mm³ range), meaning that fMRI can
provide specific information about the origin of the brain signal and thereby information about which part
of the brain is involved in a particular activity (e.g., visual perception). Note that fMRI only measures 
relative (and not absolute) changes of the oxygenation level and that fMRI visualizations are actually 
representations of statistical differences of the fMRI signal across different experimental conditions. 
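
The delayed and prolonged character of the hemodynamic response can be made concrete with a short sketch. It assumes a double-gamma response shape of the kind commonly used in fMRI analysis software; the shape parameters (6 and 16), the undershoot ratio (1/6), the 2 s sampling interval, and all function names are illustrative assumptions rather than values taken from the studies reviewed here.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, written out to stay within the standard library."""
    if t <= 0:
        return 0.0
    x = t / scale
    return x ** (shape - 1) * math.exp(-x) / (math.gamma(shape) * scale)


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Assumed response shape: an early peak minus a later, smaller undershoot."""
    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)


def predicted_bold(stimulus, tr, hrf_length_s=32.0):
    """Discrete convolution of a stimulus time course (one value per volume) with the HRF."""
    kernel = [double_gamma_hrf(k * tr) for k in range(int(hrf_length_s / tr) + 1)]
    out = []
    for n in range(len(stimulus)):
        out.append(sum(stimulus[n - k] * kernel[k]
                       for k in range(len(kernel)) if n - k >= 0))
    return out


tr = 2.0                    # seconds between data points (assumed)
onset_volume = 5            # one brief picture shown at t = 10 s
stimulus = [0.0] * 30
stimulus[onset_volume] = 1.0

bold = predicted_bold(stimulus, tr)
peak_volume = max(range(len(bold)), key=lambda i: bold[i])
delay_s = (peak_volume - onset_volume) * tr

print(f"picture at {onset_volume * tr:.0f} s, predicted BOLD peak about {delay_s:.0f} s later")
```

Under these assumptions, the predicted signal peaks several seconds after a brief stimulus and takes many more seconds to return to baseline. This is one reason why closely spaced events overlap in the measured signal and why the timing of stimulus presentation has to be planned carefully.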

In summary, EEG has a very high temporal resolution and is therefore a suitable method to investigate the
timing of brain activity. FMRI has a much higher spatial resolution than EEG and is an appropriate method
to indicate which brain regions are involved in a particular (e.g., perceptual) task. 


4. Results of visual expertise research 

This section presents the outcomes of studies addressing neural correlates of visual perceptual 
expertise. How does the development of expertise change temporal and spatial aspects of information 
processing? First, we summarize the findings of EEG research. Second, we review the fMRI findings. And 
finally, in a special section, we zoom in on the relatively new field of cognitive-neuroscience research 
applied to medical image diagnosis. 

4.1. EEG research: The N170 

Using ERPs based on EEG measurements, cognitive-neuroscience research has provided strong 
support for the idea that a particular early ERP component, namely the N170 (introduced above), plays a 
significant role when participants process photographs and pictures of faces (Bentin et al., 1996; for a
meta-analysis of this research, see Hinojosa, Mercado, & Carretie, 2015). Interestingly, it could be demonstrated
that patients who suffer from face blindness, also called prosopagnosia (the inability to recognize faces), did
not show an enhanced N170 component when processing faces (for reviews, see Richler &
Gauthier, 2014; Towler, Fisher, & Eimer, 2017). This body of evidence on face processing has inspired 
research on perceptual expertise because of the assumption, in part, that all humans are ‘face experts’. If the 
N170 was such a stable neurophysiological marker in face perception, would the enhanced N170 also reflect 
expert processing of other familiar, domain-specific objects? A pioneering study by Tanaka and Curran 
(2001) confirmed this hypothesis. EEG was recorded while participants viewed photographs of cars or birds. 
Approximately 164 ms after stimulus onset, participants who were car experts showed a larger N170 
component for cars compared to birds, and participants who were bird experts showed a larger N170 
component for birds compared to cars. Tanaka and Curran (2001) carefully controlled for stimulus artefacts 
including image properties and task instruction, and also for group effects in that the same participants 
viewed photos of cars and birds and were thus expert and novice in different trials of the experiment. In 
summary, this study revealed that visual perceptual expertise is associated with an enhanced N170 
component and thus with very early stages of visual information processing. In recent years, this effect
has been replicated with both car (Gauthier & Curby, 2005; Scott, Tanaka, Sheinberg, & Curran, 2008) and
bird stimuli (Scott, Tanaka, Sheinberg, & Curran, 2006; Tanaka et al., 2005). Research has also shown the
expertise effect on the N170 using artificial stimuli including Blobs (Curran, Tanaka, & Weiskopf, 2002) and
Greebles (Rossion, Gauthier, Goffaux, Tarr, & Crommelinck, 2002; Rossion, Kung, & Tarr, 2004) and with
non-object letter symbols, including Japanese (Maurer et al., 2008) and Chinese characters (Fan et al., 2015;
Qi et al., 2016). Taken together, EEG studies suggest that, similar to face perception (Hinojosa et al., 2015;
Richler & Gauthier, 2014; Towler et al., 2017), visual expertise modifies the temporal aspects of information
processing, as reflected in an enhanced N170 component for trained or domain-specific objects.

4.2. fMRI research: Revealing the functional role of the FFA

In 1997, Kanwisher and colleagues located a brain region in the fusiform gyrus that is strongly 
activated when humans view faces. This region was called the fusiform face area (FFA). Two years later, in 
1999, Gauthier, Tarr, Anderson, Skudlarski, and Gore demonstrated that the FFA is not only activated when 
viewing faces but also indicates the level of expertise with artificial objects (in this case Greebles). The 
assumption was that the selectivity of FFA reflects a more generalized form of visual perceptual expertise 
that is not intrinsically specific or restricted to processing face stimuli (Tarr & Gauthier, 2000). This 
assumption was confirmed with bird (Gauthier et al., 2000), car (Gauthier et al., 2000), and artificial stimuli 
(Gauthier et al., 1999). However, an early criticism of these studies was that these stimuli were similar to 
faces: Indeed, parts of Greebles evoke resemblance to faces, birds have faces, and also cars, at least in
three-quarter frontal views, resemble faces (Kanwisher, 2000; Grill-Spector, Knouf, & Kanwisher, 2004). The
conclusion was, thus, that FFA activation was more likely the result of face similarity than object expertise. 
To minimize the effect of faces, Xu (2005) used side view photographs of birds and cars, and reported that 
visual perceptual expertise was still associated with FFA activation. Since then, a wealth of fMRI
studies has supported Gauthier’s initial assumption that visual expertise in object perception is associated with
activation in the FFA independent of face similarity (Bilalic et al., 2011; Bukach, Gauthier, & Tarr, 2006; 
Palmeri & Gauthier, 2004; Righi, Tarr, & Kingon, 2013; but see Bartlett, Boggan, & Krawczyk, 2013, for a 
study that did not find differences between experts and novices in FFA activation in a chess task. In that 
study, artificially inverted and distorted chess stimuli were used, so it is a matter of debate if these stimuli 
were suitable to trace chess expertise). In recent years, the discussion around face selectivity has largely been
replaced by a discussion of whether the FFA is the only region relevant for processing familiar
objects or whether visual perceptual expertise is associated with the interaction between different brain
regions (e.g., Bilalic, Langner, Campitelli, Turella, & Grodd, 2015; Harel, Kravitz, & Baker, 2013; Wong & 
Wong, 2014). In short, there seems to be broad consensus in the field that the processing of objects involves 
more than just FFA. Specifically, Wong and Wong (2014, p. 308) explain that “perceptual expertise 
researchers have been considering the interaction between perceptual and cognitive processing as an 
important component in understanding perceptual expertise for different objects. It is therefore unnecessary 
to create the debate between the so-called “perceptual view” and “interactive view” of expert object 
recognition, as the interaction between perceptual and cognitive processing has been well accommodated in 
perceptual expertise research.” Overall, studies using fMRI demonstrated that when experts view
domain-specific stimuli, the FFA and other brain regions are activated. The precise location of these “other” brain
regions and their particular interaction patterns with FFA, however, are still under investigation. 

Table 1

Studies examining neural correlates of expertise differences with medical images as stimuli

| First author (year) | Participants | Stimulus | Task (apparatus) | Main findings |
|---|---|---|---|---|
| Bilalic (2016) | 16 radiologists, 15 students | Upright or inverted chest X-ray films, photographs of faces, rooms, and tools | Viewing task: 1-back task (fMRI) | Expertise effect in FFA |
| Fiorio (2010) | 8 clinicians, 10 students | Photographs and videos of healthy and dystonic writing movement | Decision task: judgment if and to what extent writing was dystonic (TMS) | Corticospinal activation in students, but not in experts |
| Haller (2005) | 12 radiologists, 12 laypersons | Original and manipulated radiologic images | Detection task: finding manipulations on the image (fMRI) | Increased activation in temporal and frontal gyri in radiologists |
| Harley (2009) | 7 radiologists, 6 4th-yr residents, 7 1st-yr residents | Normal and abnormal chest X-ray films | Detection task: finding nodules (fMRI) | Positive correlation of FFA activity with expertise |
| Melo (2011) | 25 radiologists | Chest X-ray films that included lesions, animals, or letters | Detection task: finding lesions, animals, or letters (fMRI) | Activation in left inferior frontal sulcus and posterior cingulate cortex |
| Ribas (2013) | 29 radiologists | Veterinary X-ray films | Decision task: choose between four diagnosis options (EEG) | Positive correlation of expertise with electrode activity C4, F3, F8, OZ, T6 |

4.3. Visual expertise in medical image diagnosis 

Gegenfurtner, Siewiorek, Lehtinen, and Saljo (2013) reviewed the literature on visual expertise in 
relation to medical image diagnosis and identified three of 21 studies that examined neural correlates of 
expert-novice differences when inspecting medical visualizations (Haller & Radue, 2005; Fiorio et al., 2010; 
Harley et al., 2009). Since the review of Gegenfurtner et al. (2013), two additional studies were published 
that addressed the neural basis of visual perceptual expertise in medicine (Bilalic et al., 2016; Ribas et al., 
2013). We briefly review these studies here, together with an additional paper (Melo et al., 2011) that 
examined the neural correlates of radiologists’ diagnoses. Although Melo et al. (2011) did not analyse the 
effect of expertise, the study is relevant in the current context as it discusses the involvement of brain regions 
when deriving diagnoses from medical visualizations. Table 1 offers an overview of the six studies. 

Bilalic and colleagues (2016) asked radiologists and medical students to indicate if the current 
stimulus they were seeing was the same as the previous one. Stimuli were chest X-ray films that were either 
presented in upright position or rotated by 180° (inverted), as well as stimuli including photographs of faces, 
rooms, and tools. The findings suggest that the FFA of radiologists compared to medical students was more 
sensitive in differentiating upright or rotated X-ray films from the photographs showing rooms and tools. 
Bilalic et al. (2016) conclude that the FFA activation was likely associated with the participants’ level of
expertise. Harley and colleagues (2009) also found a positive correlation between FFA activation and
the visual expertise of radiologists. However, Harley et al. (2009) also reported that activity in the right FFA
did not differ between radiologists and first-year residents looking at radiological images. Haller and Radue
(2005) presented radiologic images (computed tomography scans, magnetic resonance images, and
ultrasound pictures) that were either original or manipulated to radiologists and non-radiologists. The 
participants were asked to decide if the presented images were original or manipulated. The group of 
radiologists showed significantly stronger activation than the group of non-radiologists in the bilateral
middle and inferior temporal gyrus, bilateral medial and middle frontal gyrus, and left superior and inferior
frontal gyrus, regions that are thought to be associated with visual attention and memory retrieval (Wager &
Smith, 2003). Haller and Radue’s (2005) study is interesting because it is the first to indicate that different
brain regions interact when experts visually process medical images. The findings of Melo et al. (2011) and
Ribas et al. (2013) further support this notion. In particular, using EEG, Ribas and colleagues (2013) report
that participants with higher levels of expertise had more electric activity compared to participants with
lower levels of expertise. Fiorio et al. (2010) used transcranial magnetic stimulation (TMS) to examine how 
participants differed when viewing photographs and short video sequences of healthy and dystonic writing. 
Briefly, in TMS, a magnetic field generator is placed in close proximity to the head of a participant in order 
to evoke electric currents in brain areas (for introductions to TMS, see Walsh & Cowey, 2000; Ward, 2006). 
The authors showed that “observation of pathological actions differently modulates the viewer’s motor 
resonant system, depending on previous knowledge, visual expertise, and ability to recognize sub-optimal 
movement kinematics” (p. 698). Fiorio and colleagues (2010) used dynamic stimuli, which is still rare in the 
field of cognitive-neuroscience methods applied to medical diagnosis. 

On the basis of the studies reviewed here, it seems safe to conclude that expertise in medical 
diagnosis cannot be located in and isolated to a single brain area but, instead, expertise seems to be 
associated with changes in activation in a multitude of neural regions as a function of experience, amount of 
training, and knowledge structures. We should note, however, that this interpretation is contingent on the 
level of task complexity in the original studies. It seems likely that more brain regions are activated when the 
task is more complex, while many studies employ simplified versions of the task of medical image diagnosis. 
The six studies reviewed in Table 1 differ in their task complexity. The complexity of the employed tasks 
was categorized following the four-level model of task complexity in the comprehension of visualizations 
(Gegenfurtner et al., 2011) shown in Table 2. This model defines task complexity on the basis of contextual 
demands that differ as a function of the number of desired outcomes, the multiplicity of paths to attain 
desired outcomes, and the coordinative complexity of informational cues in the task material while moving 
toward task completion. The reviewed studies include one viewing task, in which participants had to say if 
they had just seen the same image (Bilalic et al., 2016); three detection tasks, in which participants were 
asked to search for an abnormality or specific target within the image (Haller & Radue, 2005; Harley et al., 
2009; Melo et al., 2011); and two decision tasks, in which the participants had to choose among a given set 
of options (Fiorio et al., 2010; Ribas et al., 2013). The study by Fiorio et al. (2010) is the only one using TMS
and the study by Ribas et al. (2013) the only one using EEG; thus, findings from these two studies cannot
easily be compared to the other four studies using fMRI. Somewhat surprisingly, to date, no study has asked
participants to produce a full diagnosis from presented visual material, perhaps because tasks inside a
magnetic resonance scanner are kept deliberately simple and diagnostic problem-solving tasks would be too
complex; even a simple blink causes severe artifacts in electroencephalograms. Typing is practically
impossible, and speaking might be hard to record due to the noise made by the scanner. Furthermore, if the
task is too complex in comparison to the control task, this might lead to differences in brain activation that
are so widespread that it might no longer be possible to interpret them meaningfully. This perhaps explains the
scarcity of cognitive-neuroscience studies in medical image diagnosis relative to its wide application in the 
visual perceptual expertise literature. 

Table 2

Four-level model of task complexity in the comprehension of visualizations (adapted from Gegenfurtner,
Lehtinen, & Säljö, 2011)

| Task type | Multiplicity of solution paths | Number of desired outcomes | Coordinative complexity | Example |
|---|---|---|---|---|
| Viewing task | Low | Low | Low | Looking at medical images |
| Detection task | Low | Low | High | Searching for lung nodules |
| Decision task | Low | High | High | Deciding between given options |
| Problem-solving task | High | High | High | Generating a diagnosis |
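
The categorization of the six studies can also be written down as a small data structure, which makes the gap at the highest complexity level easy to see. The following is a sketch only: the class, the dictionary keys, and the `level()` helper (one plus the number of dimensions rated "High") are hypothetical conveniences introduced here for illustration; the ratings and the study assignments are copied from Tables 1 and 2.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskType:
    name: str
    solution_paths: str            # multiplicity of solution paths
    desired_outcomes: str          # number of desired outcomes
    coordinative_complexity: str

    def level(self) -> int:
        """Illustrative ordinal level: 1 plus the number of 'High' dimensions."""
        dims = (self.solution_paths, self.desired_outcomes, self.coordinative_complexity)
        return 1 + sum(d == "High" for d in dims)


# Ratings as listed in Table 2.
TASK_TYPES = {
    "viewing": TaskType("Viewing task", "Low", "Low", "Low"),
    "detection": TaskType("Detection task", "Low", "Low", "High"),
    "decision": TaskType("Decision task", "Low", "High", "High"),
    "problem_solving": TaskType("Problem-solving task", "High", "High", "High"),
}

# Task type and apparatus of each study, as listed in Table 1.
STUDIES = {
    "Bilalic (2016)": ("viewing", "fMRI"),
    "Fiorio (2010)": ("decision", "TMS"),
    "Haller (2005)": ("detection", "fMRI"),
    "Harley (2009)": ("detection", "fMRI"),
    "Melo (2011)": ("detection", "fMRI"),
    "Ribas (2013)": ("decision", "EEG"),
}

counts = Counter(task for task, _apparatus in STUDIES.values())
for key, task in sorted(TASK_TYPES.items(), key=lambda item: item[1].level()):
    print(f"level {task.level()}: {task.name:<22} {counts[key]} studies")
```

Listing the studies this way shows one viewing task, three detection tasks, two decision tasks, and no problem-solving task among the reviewed studies.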


If we compare the studies using medical images as experimental stimuli (reviewed in Table 1) with 
the wider visual perceptual expertise literature and their findings of FFA activation and the enhanced N170 
component, it is evident that the expertise effect in FFA was partially confirmed with medical images 
(Bilalic et al., 2016; Harley et al., 2009). An increased N170 component has not yet been systematically 
addressed. It is very encouraging that, since our review some years ago (Gegenfurtner et al., 2013), more and 
more cognitive-neuroscience studies using fMRI or EEG have emerged that address medical image diagnosis. We 
expect that future research in this area will proliferate in an attempt to replicate FFA activation and N170 
enhancement as neural correlates of visual expertise in the medical domain. These studies will help us 
understand how medical expertise changes temporal and spatial brain activation patterns associated with the 
diagnosis of medical visualizations. 


5. Discussion 

After reviewing typical research questions and experimental stimuli, describing EEG and fMRI, and 
reporting the current state of the neural correlates of visual perceptual expertise, this section will now 
elaborate on the advantages and limitations of cognitive-neuroscience research in the current context. What 
are the benefits of using fMRI and EEG? What are caveats of these methodologies? And what are directions 
for future research that originate from this review? 

5.1. Benefits 

The benefits of applying cognitive-neuroscience methods in research on visual perceptual expertise 
relate to the extension of behavioural research, high temporal and spatial sensitivity, and high levels of 
control. We elaborate on each of these benefits in turn. First, cognitive neuroscience can extend behavioural 
research. In particular, cognitive neuroscience affords different units and levels of analysis; these, in turn, 
make visible some of the neural correlates underlying cognitive processes that would not be accessible with 
behavioural measures (Ansari, De Smedt, & Grabner, 2012). Framing this triangulation, Stern and Schneider 
(2010) introduced the metaphor of a digital road map: with cognitive neuroscience, researchers can zoom in 
to the neural levels of cognition and perception and examine processes inside the human brain. If researchers 
are interested in these processes, EEG and fMRI offer measures that can unveil neural activation as the basis 
for observable expertise differences (Gegenfurtner et al., 2013; Gruber et al., 2010). 

Another advantage of many cognitive-neuroscience methods is their very high temporal and spatial 
resolution. More precisely, EEG takes measures in the range of milliseconds. Thus, if we are interested in the 
temporal aspects of expert performance, then time-sensitive EEG is a well-suited method, especially if 
measured in parallel with pupillometry (Szulewski, Gegenfurtner, Howes, Sivilotti, & Van Merrienboer, 
2017) and eye tracking (Holmqvist, Nystrom, Anderson, Dewhurst, Jarodzka, & Van de Weijer, 2011; 
Jarodzka, Jaarsma, & Boshuizen, 2015). In contrast, fMRI is uniquely capable of locating very precisely 
the brain regions that are active, e.g., when participants of varying levels of expertise interpret complex 
images. If research aims to uncover where and when neural activity occurs during expert performance, then 
EEG and fMRI (ideally in combination) are two very powerful, non-invasive methodological tools. 

Finally, because of the extremely high sensitivity of EEG (temporal) and fMRI (spatial), experiments 
in cognitive neuroscience are typically very controlled. These levels of control afford high levels of external 
validity (Ansari et al., 2012; De Smedt, 2014). Because researchers invest a considerable amount of time and 
energy in securing experimental control, including a strict selection of participants (for example, only 
right-handed people) and carefully filtered stimulus materials (exemplified by the huge effort of creating 
Greebles), findings from EEG and fMRI often result in stable, generalizable inferences. 

These generalizable inferences can inform researchers when developing theories of visual expertise. 
Because cognitive-neuroscience methods have high levels of temporal and spatial sensitivity, as well as 
experimental control, neural correlates of visual expertise can be used in theory testing and development 
(Bilalic et al., 2015). Neuroscientific findings thus have the potential to inform expertise research in two 
ways. First, they can be used to test the predictive validity of existing models and theories, for example on 
how expertise develops in novices (Kok, De Bruin, Robben, & Van Merrienboer, 2012; Van Geel, Kok, 
Dijkstra, Robben, & Van Merrienboer, 2017), intermediates (Boshuizen & Schmidt, 1992; Ericsson & 
Lehmann, 1992), and experts (Gegenfurtner, 2013; Gegenfurtner, Nivala, Lehtinen, & Saljo, 2009). Second, 
they can be used to develop novel theories to account for expertise differences revealed by 
cognitive-neuroscience methods, differences that would have remained unobservable with behavioral methods alone 
(Bilalic et al., 2015). 

5.2. Caveats 

No method comes without limitations. Powerful and elegant as cognitive neuroscience may appear, 
its methodology also includes different costs that can compromise the available evidence. Caveats include 
the temporal and spatial resolution, ecological validity, a reductive bias, and limited implications for 
educational practice. First, and perhaps surprisingly, the extreme sensitivity of EEG and fMRI measures, 
a benefit on the one hand, also imposes a number of limitations on the experimental setup. For example, in 
EEG research, even the slightest motion, such as a blink or a twitch of the nostrils, creates severe data artifacts. 
Research projects thus often lose a considerable amount of data because participants did not remain still 
enough while their neural activity was recorded (Ansari et al., 2012; De Smedt, 2014). This is particularly 
detrimental given the financial costs of data collection and the small sample sizes with which cognitive 
neuroscience often works. To compensate for this data loss, even stricter experimental controls are 
developed and implemented. Because EEG and fMRI are so sensitive to small motions, the possibilities for 
ecologically valid experiments are restricted. 

This is related to compromises in the ecological validity of cognitive neuroscience experiments. 
Experts are not typically motionless, nor do they work inside the tube of a 3-tesla MRI scanner. It is thus a matter of 
debate to what extent cognitive neuroscience can reflect the complexity of processes and practices 
associated with real-world expertise. Do we force experts to act in too artificial ways? Can we capture how 
experts diagnose a patient case when we show them a chest X-ray in a rotated, blurred, or otherwise distorted 
mode, for the duration of only a few seconds? The high level of experimental control, which is clearly a benefit 
of cognitive neuroscience, comes at the cost of ecological validity. The limited 
ecological validity is also associated with the tasks that are typically used. Instead of complex problem-solving 
tasks that would reflect medical diagnosis, many of the reviewed studies employed lower levels of task 
complexity (Gegenfurtner et al., 2011). Furthermore, study participants are asked to complete these tasks 
repeatedly in longer sessions to get readable signals, which can further compromise ecological validity. This 
is in line with De Smedt’s (2014) observation that tasks used in neuroscience “need to be very elementary, 
because the larger the number of cognitive processes in a particular task, the more difficult it will be to 
disentangle these cognitive processes physiologically.” 

Cognitive neuroscience is interested in the neural correlates of behavior, cognition, emotion, etc. 
(Squire et al., 2013; Stern & Schneider, 2010; Ward, 2006). Epistemologically, from a neuroscience 
perspective, visual perceptual expertise tends to be reduced to changes in electrical activity or blood flow. 
While this can render fascinating findings, other important ingredients of expert performance are ignored. 
Certainly, all research is reductionist (Lehtinen, 2012; Saljo, 2009). One must decide what to 
measure because we simply cannot account for all relevant aspects in a single study, interesting as these 
aspects may be (Damşa et al., 2017). Focusing on neural levels does not imply that we uncover the basis of 
human learning. One could easily argue that the basis is the social context within which we are situated 
(Gegenfurtner & Szulewski, 2017; Saljo, 2009). In describing this reductive bias, Lehtinen (2012) notes: 
“Because of the impressive technical development of brain research during the last two decades (...) many 
neuroscientists have quite a strong tendency towards downwards reductionism (emphasis in the original). 
This reductionism stems from the idea that research registering brain processes with complex technical tools 
finally opens up a real scientific approach to learning research.” Cognitive neuroscientists are well aware that 
EEG and fMRI are just two among the many other methods of learning research (e.g., De Smedt, 2014). 

This reductive bias is not only associated with limitations in how expertise is measured and 
methodically approximated; it also signals limitations in how expertise and performance are theorized and 
conceptually framed (Lehtinen, 2012; Saljo, 2009; Siewiorek & Gegenfurtner, 2010). More specifically, 
theories of visual expertise that are grounded exclusively in neuroscientific evidence risk de-emphasizing 
other facets of how expert performance is enacted and displayed in real-world activities and practices 
(Gibson, 1986; Goodwin, 1994; see also De Bruin, 2017; Gegenfurtner et al., 2017). This risk is of course 
inherent in all mono-method designs, largely because single-method studies capture a limited number of 
units of analysis. Conversely, the combination of approaches in mixed-method or multi-method designs 
allows for the triangulation of units of analysis, which can inform a theory of visual perceptual expertise that 
encompasses different analytic levels beyond what is evident from single-method approaches like EEG, 
TMS, or fMRI. While the benefits of bridging methods of expertise research are clear, methods are always 
part of a scientific community. These communities have agency as political actors and “defend” their 
methods against the influences of competing academic realms (Al Lily, Foland, Stoloff, Gogus, Erguvan, 
Awshar, et al., 2017), so it will be interesting to see if and to what extent expertise 
researchers will (continue to) embrace methodological triangulation and combine cognitive neuroscience 
with other methodological approaches in their studies for the purpose of theory development. 

Related to that is a false belief that findings from EEG and fMRI would be directly applicable and 
informative for re-designing learning environments and curricula. Educational neuroscientists work hard to 
temper the hopes that many practitioners have when they read that finally, once we understand the 
brain, we understand how learning, expertise, and education “work”. Ansari and colleagues (2012) write that 
“the most obvious question a teacher may ask is, ‘How will I be able to apply this knowledge?’ There is, in 
our view, no reason to expect that neuroimaging research, will determine directly how teaching should take 
place. This is considered by many ‘a bridge too far’”. Thus, cognitive or educational neuroscience may have 
a very limited impact on educational practices. Does this mean we should not conduct this kind of research? 
Certainly not; but we should, perhaps, rethink our expectations about what neuroscience measures can do 
(Ansari et al., 2012; De Smedt, 2014; Lehtinen, 2012; Saljo, 2009; Stern & Schneider, 2010). 

5.3. Directions for future research 

Examining the neural correlates of visual expertise is a fascinating endeavour. This review has 
identified a small, still limited number of studies that examined how visual perceptual expertise in medical 
image diagnosis correlates with EEG, fMRI, and TMS measures. What are directions for future research that 
follow from this review? First, all but one of the reviewed studies in Table 1 used static pictures. Only Fiorio 
and colleagues (2010) used video sequences as stimuli. We thus recommend exploring and testing if and to 
what extent neuroscience-based visual perceptual expertise research can use dynamic stimuli. Second, future 
research can make more use of EEG as well as other neuroscience approaches such as TMS or MEG to study 
the neural correlates of visual expertise in medical image diagnosis. Third, future 
research can combine neuroscience methods with other online measures of expertise, including eye 
tracking and pupillometry (Holmqvist et al., 2011; Gegenfurtner & Seppanen, 2013; Kok et al., 2012; 
Szulewski et al., 2017), provided that the constraints of different temporal scales can be accommodated. Such 
combinations would be theoretically interesting as a means of investigating how eye movements and neural 
activity correlate in expert diagnostic reasoning. Fourth, the implications of cognitive neuroscience for 
education and training need to be explored. To what extent can clinical practitioners and medical educators 
benefit from neuroscientific measures? This is a question that applies to the field of medical image 
perception more generally and is not exclusive to cognitive neuroscience; for example, eye tracking, too, 
used to be criticized for not being relevant enough to medical education and training, but has demonstrated 
its benefits in the form of eye movement modeling examples (Jarodzka, Balslev, Holmqvist, Nystrom, 
Scheiter, Gerjets, et al., 2012; Seppanen & Gegenfurtner, 2012). It remains to be seen in future research whether, 
and how, a similar approach can be developed for functional imaging. We should note, however, that 
cognitive-neuroscience methods usefully complement instructional design studies: while design studies 
reveal what works, neural correlates can indicate why it works (Gegenfurtner et al., 2013; Kok, Van Geel, 
Van Merrienboer, & Robben, 2017). Finally, EEG and fMRI provide measures of the temporal and spatial 
configurations of visual perceptual expertise. These measures should be incorporated into existing theory 
frameworks of visual perceptual expertise to advance our conceptual understanding of how experts, 
intermediates, or novices comprehend medical visualizations. 


6. Conclusion 

As noted at the outset, if we assume that individual differences in visual expertise are reflected in 
differences in the brain, then cognitive neuroscience methods can be used to examine the neural correlates of 
the experts’ visual skills. These methods can complement other methodologies interested in how experts in 
medical disciplines form their diagnoses. This review summarized research on visual perceptual expertise 
and described which research questions were typically asked, which stimuli and functional neuroimaging 
methods were frequently used, and how experts and novices differ in their neural representations (e.g., with 
respect to activation within the FFA). We also outlined some of the benefits, limitations, and future 
directions of cognitive-neuroscience research as they apply to the comprehension of (medical) visualizations. 
This methodological review closes with the hope that interested researchers, who are perhaps as yet 
inexperienced with cognitive neuroscience, will find this paper a useful introduction to the neural 
correlates of visual expertise. 


Keypoints 

■ Cognitive neuroscience can uncover the neural correlates of visual perceptual expertise 
■ Electroencephalography can reveal temporal adaptations of expertise (e.g., N170) 
■ Functional magnetic resonance imaging can reveal spatial adaptations of expertise (e.g., FFA) 
■ Cognitive neuroscience examining expertise in medical image diagnosis is promising but still in 
  its infancy 
■ EEG and fMRI can complement and extend each other as well as other methodologies in 
  expertise research 